Notes on Austin's multiple ergodic theorem 



Thierry de la Rue 
Abstract 

The purpose of this note is to present my understanding of Tim Austin's proof of
the multiple ergodic theorem for commuting transformations, emphasizing the use
of joinings, extensions and factors. The existence of a sated extension, which is a key
argument in the proof, is presented in a general context.

1 Introduction 

The norm convergence of multiple ergodic averages for commuting transformations
(Theorem 1.1 below) was first proved in 2008 by Terence Tao [8]. We intend to present here the
quite different proof proposed by Tim Austin [1], using the machinery of joinings, extensions
and factors. This text was written after Austin's talk at the conference Dynamical Systems and
Randomness (held in Paris, Institut Henri Poincaré, May 2009), and a short conversation
with him following his talk. The major part is, as far as I understand it, quite faithful to
Austin's original proof. The only slightly original contribution is the proof of the existence
of a sated extension, which is presented in a general context.

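Theorem 1.1. Let $d \ge 1$, and let $T_1, \ldots, T_d$ be $d$ commuting, measure-preserving invertible
transformations of the standard Borel probability space $(X, \mathscr{A}, \mu)$. Then for any choice of
$f_1, \ldots, f_d \in L^\infty(\mu)$, the multiple ergodic averages

$$\frac{1}{N} \sum_{n=1}^{N} f_1 \circ T_1^n \cdot f_2 \circ T_2^n \cdots f_d \circ T_d^n$$

converge in $L^2(\mu)$ as $N \to \infty$.

The strategy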

The case $d = 1$ corresponds to the standard ergodic theorem of von Neumann, and in this
case the limit is clearly identified as the orthogonal projection of the function $f_1$ on the
subspace of $L^2$-functions which are measurable with respect to the factor $\sigma$-algebra

$$\mathscr{I}^{T_1} := \left\{ A \in \mathscr{A} : \mu(A \,\Delta\, T_1^{-1}A) = 0 \right\}.$$

(This factor $\sigma$-algebra is called the isotropy factor by Austin; isotropy factors play a crucial
role here, and we will use the above notation for several transformations in the sequel.)
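
For reference, the case $d = 1$ can be written out in one line. This is only a restatement of the von Neumann mean ergodic theorem in the notation above; $\mathbb{E}[\,\cdot \mid \mathscr{I}^{T_1}]$ denotes the conditional expectation, that is, the orthogonal projection mentioned in the text.

```latex
\[
  \frac{1}{N}\sum_{n=1}^{N} f_1\circ T_1^{n}
  \;\xrightarrow[N\to\infty]{L^2(\mu)}\;
  \mathbb{E}\bigl[f_1 \,\big|\, \mathscr{I}^{T_1}\bigr],
  \qquad
  \mathscr{I}^{T_1}=\bigl\{A\in\mathscr{A} : \mu(A\,\Delta\,T_1^{-1}A)=0\bigr\}.
\]
```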

The proof for $d \ge 2$ is presented by induction on $d$: we assume that $d \ge 2$ is such that
Theorem 1.1 has already been proved up to the case of $d - 1$ commuting transformations.
Then we identify a simple class $\mathscr{C}$ of systems $(X, \mathscr{A}, \mu, T_1, \ldots, T_d)$ with $d$ commuting
transformations for which the desired result is easily deduced from the $(d-1)$-case. The next step
consists in the introduction of a larger class of systems, the so-called $\mathscr{C}$-sated systems. (Note
that the notion of satedness is not explicit in [1]: it has been formalized in a subsequent
work by Austin [2] dealing also with some polynomial sequences.) The $\mathscr{C}$-sated systems are
characterized by a quite simple structure of their joinings with any $\mathscr{C}$-system, and this enables
us to prove the theorem for them, using the induction hypothesis and a version of the Van der
Corput lemma. Finally, and this is the point where the machinery of joinings plays its crucial
role, we show that any system possesses an extension which is $\mathscr{C}$-sated. Since we know that
the theorem holds for $\mathscr{C}$-sated systems, it obviously holds for all their factors, hence for all
systems.

For the sake of simplicity, we first present the induction step passing from one to two
commuting transformations, admitting the existence of a $\mathscr{C}$-sated extension for any system.
Then we will see how the same argument can be generalized to pass from $d - 1$ to $d$
transformations. Finally, in a completely independent section, we prove a general result on joinings
showing why any system admits a $\mathscr{C}$-sated extension, which completes the proof of Theorem 1.1.

2 The case of two commuting transformations 

In all this section, we assume d = 2 and we present the argument showing how the theorem 
for two commuting transformations can be proved, using the well-known result in the case of 
a single transformation. 

2.1 $\mathscr{C}$-systems

We first observe that there are two very simple cases in which the convergence in $L^2$ of the
ergodic averages

$$\frac{1}{N} \sum_{n=1}^{N} f_1 \circ T_1^n \cdot f_2 \circ T_2^n \qquad (1)$$

is a trivial consequence of the single-transformation case:

• If $T_1 = \mathrm{Id}$, which amounts to saying that the isotropy factor $\mathscr{I}^{T_1}$ is the whole
$\sigma$-algebra $\mathscr{A}$: the above ergodic average reduces to $f_1$ times an ergodic average for the
single transformation $T_2$. Obviously, it is enough that $f_1$ be $\mathscr{I}^{T_1}$-measurable to get this
reduction.

• If $T_1 = T_2$, in other words if the isotropy factor $\mathscr{I}^{T_2T_1^{-1}}$ is the whole $\sigma$-algebra $\mathscr{A}$: then
(1) reduces to an ergodic average for the product $f_1 f_2$ and the single transformation
$T_1 = T_2$. Note that this reduction holds as soon as $f_1$ is measurable with respect to
$\mathscr{I}^{T_2T_1^{-1}}$.

Now, we introduce the class of $\mathscr{C}$-systems as the class of systems $\mathbf{X} = (X, \mathscr{A}, \mu, T_1, T_2)$
for which

$$\mathscr{A} = \mathscr{I}^{T_1} \vee \mathscr{I}^{T_2T_1^{-1}}.$$

In other words, $\mathbf{X}$ is a $\mathscr{C}$-system if it is isomorphic to a joining of two systems of the form
$\mathbf{X}_1 = (X_1, \mathscr{A}_1, \mu_1, \mathrm{Id}, S)$ and $\mathbf{X}_2 = (X_2, \mathscr{A}_2, \mu_2, T, T)$. In a system of the class $\mathscr{C}$, any bounded
measurable function $f_1$ can be arbitrarily well approximated in $L^2$ by a finite sum of products
of the form $gh$, where $g$ is $\mathscr{I}^{T_1}$-measurable and $h$ is $\mathscr{I}^{T_2T_1^{-1}}$-measurable. For each such term
$gh$, we can simultaneously apply the two reductions explained above, and we get that the
$L^2$-convergence of the ergodic averages (1) holds in any $\mathscr{C}$-system.
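
The simultaneous use of the two reductions can be spelled out for a single product term. The following is a sketch in the notation above, with $g$ bounded and $\mathscr{I}^{T_1}$-measurable and $h$ bounded and $\mathscr{I}^{T_2T_1^{-1}}$-measurable, so that $g \circ T_1^n = g$ and $h \circ T_1^n = h \circ T_2^n$ almost everywhere.

```latex
\begin{align*}
\frac{1}{N}\sum_{n=1}^{N} (gh)\circ T_1^{n}\cdot f_2\circ T_2^{n}
  &= \frac{1}{N}\sum_{n=1}^{N} g\cdot h\circ T_2^{n}\cdot f_2\circ T_2^{n} \\
  &= g\cdot\frac{1}{N}\sum_{n=1}^{N} (h f_2)\circ T_2^{n}
  \;\xrightarrow[N\to\infty]{L^2(\mu)}\;
  g\cdot\mathbb{E}\bigl[h f_2 \,\big|\, \mathscr{I}^{T_2}\bigr].
\end{align*}
```

Since the $L^2$ norm of the averages (1) is bounded by $\|f_1\|_2 \, \|f_2\|_\infty$ uniformly in $N$, an $L^2$-approximation of $f_1$ by finite sums of such products passes to the limit.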

2.2 $\mathscr{C}$-sated systems

2.2.1 Looking for characteristic factors 

In any system $\mathbf{X} = (X, \mathscr{A}, \mu, T_1, T_2)$, it is now natural to consider the factor $\sigma$-algebra

$$\mathbf{X}_{\mathscr{C}} := \mathscr{I}^{T_1} \vee \mathscr{I}^{T_2T_1^{-1}}.$$

Considering the action of $T_1, T_2$ on this factor, we obviously get a $\mathscr{C}$-system which we also
denote by $\mathbf{X}_{\mathscr{C}}$. It is straightforward to check that, in fact, this factor is the largest factor in
$\mathbf{X}$ (in the sense of inclusion of $\sigma$-algebras) on which the action of the transformations gives
rise to a $\mathscr{C}$-system: we call it the largest $\mathscr{C}$-factor of $\mathbf{X}$.

Observe that if $f_1$ is measurable with respect to $\mathbf{X}_{\mathscr{C}}$, the same argument as above proves
the convergence in $L^2$ of the ergodic averages (1). Now, we are looking for simple conditions
on $\mathbf{X}$ ensuring that, when studying the convergence of these ergodic averages, we can replace
$f_1$ by its projection $\mathbb{E}[f_1 \mid \mathbf{X}_{\mathscr{C}}]$, which would immediately lead to the desired conclusion. In
other words, we are looking for conditions implying

$$\left\| \frac{1}{N} \sum_{n=1}^{N} f_1 \circ T_1^n \cdot f_2 \circ T_2^n \right\|_{L^2} \xrightarrow[N \to \infty]{} 0 \quad \text{as soon as } \mathbb{E}[f_1 \mid \mathbf{X}_{\mathscr{C}}] = 0. \qquad (2)$$

(Although we shall not explicitly use here the notion of characteristic factors, we can note
that the above condition is equivalent to "$(\mathbf{X}_{\mathscr{C}}, \mathscr{A})$ is a pair of characteristic factors for the
convergence of the ergodic averages (1)"; see Definition 4.1 in [1].)

2.2.2 Van der Corput lemma 

To obtain the convergence to 0 in (2), we will make use of the following lemma.

Lemma 2.1 (Van der Corput). Let $(u_n)$ be a bounded sequence in a Hilbert space. If

$$\lim_{H \to \infty} \lim_{N \to \infty} \frac{1}{H} \sum_{h=1}^{H} \frac{1}{N} \sum_{n=1}^{N} \langle u_n, u_{n+h} \rangle = 0,$$

then

$$\lim_{N \to \infty} \left\| \frac{1}{N} \sum_{n=1}^{N} u_n \right\| = 0.$$

For the sake of completeness, a proof of this lemma is included in Annex A.

2.2.3 A sufficient condition for the convergence 

In view of Lemma 12.11 we are led to study the expression 

1 H 1 N f 

ItE^E/ /i°^/ 2 oT 2 VioT^+V2°T 2 " + V 

h=l n=l Jx 
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Using the invariance of /j, with respect to T\, we can rewrite each integral in the form 

fi Ti o Ti (f 2 J 2 o T%) o (TaTf 1 )"^. 
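
The rewriting uses only that the two transformations commute and that $\mu$ is invariant under $T_1$. As a sketch, for fixed $n$ and $h$:

```latex
\begin{align*}
\int_X f_1\circ T_1^{n}\cdot f_2\circ T_2^{n}\cdot f_1\circ T_1^{n+h}\cdot f_2\circ T_2^{n+h}\,d\mu
 &= \int_X \bigl(f_1\cdot f_1\circ T_1^{h}\bigr)\circ T_1^{n}\cdot
            \bigl(f_2\cdot f_2\circ T_2^{h}\bigr)\circ T_2^{n}\,d\mu \\
 &= \int_X f_1\cdot f_1\circ T_1^{h}\cdot
            \bigl(f_2\cdot f_2\circ T_2^{h}\bigr)\circ T_2^{n}T_1^{-n}\,d\mu
    && \text{(compose with } T_1^{-n}\text{, which preserves } \mu\text{)} \\
 &= \int_X f_1\cdot f_1\circ T_1^{h}\cdot
            \bigl(f_2\cdot f_2\circ T_2^{h}\bigr)\circ (T_2T_1^{-1})^{n}\,d\mu
    && \text{(since } T_1T_2=T_2T_1\text{)}.
\end{align*}
```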
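For each fixed $h$, the usual ergodic theorem for the single transformation $T_2T_1^{-1}$ gives

$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \int_X f_1 \cdot f_1 \circ T_1^h \cdot \left( f_2 \cdot f_2 \circ T_2^h \right) \circ (T_2T_1^{-1})^n \, d\mu = \int_X f_1 \cdot f_1 \circ T_1^h \cdot \mathbb{E}\left[ f_2 \cdot f_2 \circ T_2^h \mid \mathscr{I}^{T_2T_1^{-1}} \right] d\mu.$$

Now, it is convenient to view the latter integral as

$$\int_{X \times X} f_1(x_1) \, f_1(T_1^h x_1) \, f_2(x_2) \, f_2(T_2^h x_2) \; d\left( \mu \otimes_{\mathscr{I}^{T_2T_1^{-1}}} \mu \right)(x_1, x_2), \qquad (3)$$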



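where $\mu \otimes_{\mathscr{I}^{T_2T_1^{-1}}} \mu$ denotes the relatively independent self-joining of $\mathbf{X}$ over its factor
$\sigma$-algebra $\mathscr{I}^{T_2T_1^{-1}}$. Remark that this probability distribution is invariant under the action of
the transformation $\tilde T_1 := T_1 \otimes T_2 : (x_1, x_2) \mapsto (T_1 x_1, T_2 x_2)$. Indeed, if $\phi_1$ and $\phi_2$ are bounded
measurable functions on $X$, we have

$$\begin{aligned}
\int_{X \times X} \phi_1(T_1 x_1) \, \phi_2(T_2 x_2) \, d\left( \mu \otimes_{\mathscr{I}^{T_2T_1^{-1}}} \mu \right)(x_1, x_2)
&= \int_X \mathbb{E}\left[ \phi_1 \circ T_1 \mid \mathscr{I}^{T_2T_1^{-1}} \right] \mathbb{E}\left[ \phi_2 \circ T_2 \mid \mathscr{I}^{T_2T_1^{-1}} \right] d\mu \\
&= \int_X \mathbb{E}\left[ \phi_1 \mid \mathscr{I}^{T_2T_1^{-1}} \right] \circ T_1 \cdot \mathbb{E}\left[ \phi_2 \mid \mathscr{I}^{T_2T_1^{-1}} \right] \circ T_2 \, d\mu
&& \text{(because } \mathscr{I}^{T_2T_1^{-1}} \text{ is invariant by both transformations)} \\
&= \int_X \mathbb{E}\left[ \phi_1 \mid \mathscr{I}^{T_2T_1^{-1}} \right] \circ T_1 \cdot \mathbb{E}\left[ \phi_2 \mid \mathscr{I}^{T_2T_1^{-1}} \right] \circ T_1 \, d\mu
&& \text{(since on } \mathscr{I}^{T_2T_1^{-1}} \text{, } T_1 \text{ and } T_2 \text{ coincide)} \\
&= \int_X \mathbb{E}\left[ \phi_1 \mid \mathscr{I}^{T_2T_1^{-1}} \right] \mathbb{E}\left[ \phi_2 \mid \mathscr{I}^{T_2T_1^{-1}} \right] d\mu \\
&= \int_{X \times X} \phi_1(x_1) \, \phi_2(x_2) \, d\left( \mu \otimes_{\mathscr{I}^{T_2T_1^{-1}}} \mu \right)(x_1, x_2).
\end{aligned}$$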



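Of course, $\mu \otimes_{\mathscr{I}^{T_2T_1^{-1}}} \mu$ is also invariant by the transformation $\tilde T_2 := T_2 \otimes T_2$. This gives us a
big system with two commuting transformations

$$\tilde{\mathbf{X}} := \left( X \times X, \; \mathscr{A} \otimes \mathscr{A}, \; \mu \otimes_{\mathscr{I}^{T_2T_1^{-1}}} \mu, \; \tilde T_1, \tilde T_2 \right).$$

Observe that our original system $\mathbf{X}$ is a factor of $\tilde{\mathbf{X}}$, which is obtained by considering only
the first coordinate.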

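We have to average (3) over $h$. Writing this average in $\tilde{\mathbf{X}}$ gives

$$\int_{X \times X} f_1 \otimes f_2 (x_1, x_2) \; \frac{1}{H} \sum_{h=1}^{H} f_1 \otimes f_2 \left( \tilde T_1^h (x_1, x_2) \right) d\left( \mu \otimes_{\mathscr{I}^{T_2T_1^{-1}}} \mu \right)(x_1, x_2).$$

Applying the usual ergodic theorem for $\tilde T_1 = T_1 \otimes T_2$ in $\tilde{\mathbf{X}}$, we see that the latter expression
converges, as $H \to \infty$, to

$$\int_{X \times X} f_1 \otimes f_2 \; \mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1 \otimes f_2 \mid \mathscr{I}^{\tilde T_1} \right] d\left( \mu \otimes_{\mathscr{I}^{T_2T_1^{-1}}} \mu \right).$$

An obvious sufficient condition for this limit to vanish is $\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1 \otimes f_2 \mid \mathscr{I}^{\tilde T_1} \right] = 0$. The
following proposition links this condition with the largest $\mathscr{C}$-factor $\tilde{\mathbf{X}}_{\mathscr{C}}$ of $\tilde{\mathbf{X}}$.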



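Proposition 2.2. If $\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1(x_1) \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right] = 0$, then $\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1 \otimes f_2 \mid \mathscr{I}^{\tilde T_1} \right] = 0$.

Proof. Observe that $\mathscr{I}^{\tilde T_1}$ is a $\mathscr{C}$-factor of $\tilde{\mathbf{X}}$, hence we have $\mathscr{I}^{\tilde T_1} \subset \tilde{\mathbf{X}}_{\mathscr{C}}$ by definition of
the maximal $\mathscr{C}$-factor. Therefore, a sufficient condition for the above conclusion to hold
is $\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1 \otimes f_2 \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right] = 0$. Moreover, we can note that $(x_1, x_2) \mapsto f_2(x_2)$ is also
$\tilde{\mathbf{X}}_{\mathscr{C}}$-measurable: indeed, on the $\sigma$-algebra generated by the second coordinate, the transformations
$\tilde T_1$ and $\tilde T_2$ coincide, hence this $\sigma$-algebra is itself a $\mathscr{C}$-factor of $\tilde{\mathbf{X}}$. We can then write

$$\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1 \otimes f_2 \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right] = f_2(x_2) \, \mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1(x_1) \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right],$$

from which the proposition follows immediately. □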



2.2.4 $\mathscr{C}$-sated systems

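From the above reasoning, we see that if our system $\mathbf{X}$ is such that

$$\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1(x_1) \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right] = \mathbb{E}_{\tilde{\mathbf{X}}}\left[ \mathbb{E}_{\mathbf{X}}\left[ f_1 \mid \mathbf{X}_{\mathscr{C}} \right](x_1) \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right], \qquad (4)$$

then the convergence in $L^2$ of the ergodic averages (1) holds. Observe that in the RHS of (4),
we are considering together two important factors of the big system $\tilde{\mathbf{X}}$: on the one hand
the factor generated by the first coordinate (which, as already mentioned, is nothing but the
original system $\mathbf{X}$), and on the other hand the largest $\mathscr{C}$-factor $\tilde{\mathbf{X}}_{\mathscr{C}}$ of $\tilde{\mathbf{X}}$. In other words,
we are considering a joining of our system $\mathbf{X}$ with the $\mathscr{C}$-system $\tilde{\mathbf{X}}_{\mathscr{C}}$. This motivates the
following fundamental definition.

Definition 2.3. The system $\mathbf{X}$ is said to be $\mathscr{C}$-sated if, for any joining $\lambda$ of $\mathbf{X}$ with a $\mathscr{C}$-system
$\mathbf{Y}$ and any bounded measurable function $f$ on $\mathbf{X}$, we have

$$\mathbb{E}_\lambda\left[ f(x) \mid \mathbf{Y} \right] = \mathbb{E}_\lambda\left[ \mathbb{E}_{\mathbf{X}}\left[ f \mid \mathbf{X}_{\mathscr{C}} \right](x) \mid \mathbf{Y} \right]. \qquad (5)$$

In other words, the system is $\mathscr{C}$-sated if any joining of $\mathbf{X}$ with a $\mathscr{C}$-system is relatively
independent over the largest $\mathscr{C}$-factor $\mathbf{X}_{\mathscr{C}}$ of $\mathbf{X}$.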
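
Condition (5) can equivalently be tested against bounded functions of the $\mathscr{C}$-system. The following sketch uses only the definition of conditional expectation; $g$ denotes an arbitrary bounded measurable function on $\mathbf{Y}$, a symbol introduced here for illustration.

```latex
\[
  \int f(x)\,g(y)\,d\lambda(x,y)
  \;=\;
  \int \mathbb{E}_{\mathbf{X}}\bigl[f \,\big|\, \mathbf{X}_{\mathscr{C}}\bigr](x)\; g(y)\,d\lambda(x,y)
  \qquad\text{for all bounded measurable } f \text{ on } \mathbf{X},\ g \text{ on } \mathbf{Y}.
\]
```

Both sides are obtained by integrating $g(y)$ against the two sides of (5), and since $g$ is arbitrary the two formulations are equivalent.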

Of course, any $\mathscr{C}$-sated system satisfies (4). Hence, a partial conclusion up to this point
can be stated as follows:

Proposition 2.4. If $\mathbf{X} = (X, \mathscr{A}, \mu, T_1, T_2)$ is a $\mathscr{C}$-sated system, then the ergodic averages (1)
converge in $L^2$ for any choice of $f_1$ and $f_2$ in $L^\infty(\mu)$.

3 The case of d commuting transformations



In this section, we assume that $d \ge 2$ is such that Theorem 1.1 has already been proved in
the case of $d - 1$ commuting transformations, and we adapt the arguments of the preceding
section to see how to prove the ergodic theorem in the case of $\mathscr{C}$-sated systems of $d$ commuting
transformations.

In this general case, we define the class of $\mathscr{C}$-systems as the class of systems
$\mathbf{X} = (X, \mathscr{A}, \mu, T_1, \ldots, T_d)$ for which

$$\mathscr{A} = \mathscr{I}^{T_1} \vee \mathscr{I}^{T_2T_1^{-1}} \vee \cdots \vee \mathscr{I}^{T_dT_1^{-1}}.$$

A factor $\sigma$-algebra of a system $\mathbf{X} = (X, \mathscr{A}, \mu, T_1, \ldots, T_d)$ on which the action of $T_1, \ldots, T_d$
defines a $\mathscr{C}$-system will be called a $\mathscr{C}$-factor of $\mathbf{X}$. In any system $\mathbf{X}$ there always exists a
largest $\mathscr{C}$-factor

$$\mathbf{X}_{\mathscr{C}} := \mathscr{I}^{T_1} \vee \mathscr{I}^{T_2T_1^{-1}} \vee \cdots \vee \mathscr{I}^{T_dT_1^{-1}}.$$

($\mathbf{X}_{\mathscr{C}}$ is defined as a factor $\sigma$-algebra of $\mathbf{X}$, but we will use the same notation to denote the
$\mathscr{C}$-system obtained by considering the action of $T_1, \ldots, T_d$ on this sub-$\sigma$-algebra.)

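The same argument as in the case $d = 2$ proves that the convergence in $L^2$ of the ergodic averages

$$\frac{1}{N} \sum_{n=1}^{N} f_1 \circ T_1^n \cdot f_2 \circ T_2^n \cdots f_d \circ T_d^n \qquad (6)$$

reduces, in the class of $\mathscr{C}$-systems, to the case of $d - 1$ commuting transformations, and in fact
it is enough for this reduction to be valid that $f_1$ be measurable with respect to $\mathbf{X}_{\mathscr{C}}$. Hence
we are just looking for conditions ensuring that we can replace $f_1$ in (6) by its projection
$\mathbb{E}_{\mathbf{X}}\left[ f_1 \mid \mathbf{X}_{\mathscr{C}} \right]$.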

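The class of $\mathscr{C}$-sated systems is defined word for word as in Definition 2.3. Assuming now
that $\mathbf{X} = (X, \mathscr{A}, \mu, T_1, \ldots, T_d)$ is a $\mathscr{C}$-sated system, we have to show that for any choice of
$f_1, \ldots, f_d \in L^\infty(\mu)$,

$$\mathbb{E}_{\mathbf{X}}\left[ f_1 \mid \mathbf{X}_{\mathscr{C}} \right] = 0 \;\Longrightarrow\; \lim_{H \to \infty} \lim_{N \to \infty} \frac{1}{H} \sum_{h=1}^{H} \frac{1}{N} \sum_{n=1}^{N} \int_X f_1 \circ T_1^n \cdots f_d \circ T_d^n \cdot f_1 \circ T_1^{n+h} \cdots f_d \circ T_d^{n+h} \, d\mu = 0, \qquad (7)$$

which in turn, by Lemma 2.1, implies

$$\left\| \frac{1}{N} \sum_{n=1}^{N} f_1 \circ T_1^n \cdots f_d \circ T_d^n \right\|_{L^2} \xrightarrow[N \to \infty]{} 0.$$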



3.1 Furstenberg self-joining 

Since we have assumed the validity of Theorem 1.1 for $d - 1$ commuting transformations, the
averages

$$\frac{1}{N} \sum_{n=1}^{N} \int_X g_1 \circ T_1^n \cdot g_2 \circ T_2^n \cdots g_d \circ T_d^n \, d\mu = \frac{1}{N} \sum_{n=1}^{N} \int_X g_1 \cdot g_2 \circ (T_2T_1^{-1})^n \cdots g_d \circ (T_dT_1^{-1})^n \, d\mu$$

converge for any choice of $g_1, \ldots, g_d$ in $L^\infty(\mu)$. Applying this convergence in the case of
indicator functions $g_i = \mathbf{1}_{A_i}$, $i = 1, \ldots, d$, it is standard to see that the limit defines a
probability measure $\lambda$ on $X^d$ by the formula

$$\lambda(A_1 \times \cdots \times A_d) := \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \int_X \mathbf{1}_{A_1}(T_1^n x) \cdots \mathbf{1}_{A_d}(T_d^n x) \, d\mu(x).$$

Moreover, it is straightforward to check that $\lambda$ enjoys the following properties:

• $\lambda$ is invariant by the transformation $T_i \otimes \cdots \otimes T_i$ for all $i = 1, \ldots, d$;

• the $d$ marginal distributions of $\lambda$ are equal to $\mu$;

• $\lambda$ is also invariant by the transformation $T_1 \otimes T_2 \otimes \cdots \otimes T_d$.

The first two properties above together mean that $\lambda$ is a $d$-fold self-joining of the system
$\mathbf{X} = (X, \mathscr{A}, \mu, T_1, \ldots, T_d)$. This self-joining was introduced by Furstenberg in [4], and is therefore
referred to as the Furstenberg self-joining by Austin.
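
Both invariance properties can be checked directly on product sets. The following is a sketch based on the defining formula for $\lambda$; it uses that the $T_j$ commute, that $\mu$ is $T_i$-invariant, and that shifting the range of summation by one changes a Cesàro average of terms bounded by 1 by at most $2/N$.

```latex
\begin{align*}
\lambda\bigl(T_i^{-1}A_1\times\cdots\times T_i^{-1}A_d\bigr)
 &= \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\int_X
    \prod_{j=1}^{d}\mathbf{1}_{A_j}\bigl(T_j^{n}\,T_i x\bigr)\,d\mu(x)
  = \lambda(A_1\times\cdots\times A_d), \\
\lambda\bigl(T_1^{-1}A_1\times\cdots\times T_d^{-1}A_d\bigr)
 &= \lim_{N\to\infty}\frac{1}{N}\sum_{n=2}^{N+1}\int_X
    \prod_{j=1}^{d}\mathbf{1}_{A_j}\bigl(T_j^{n}x\bigr)\,d\mu(x)
  = \lambda(A_1\times\cdots\times A_d).
\end{align*}
```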

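As in the case of two commuting transformations, we now define a big system

$$\tilde{\mathbf{X}} := \left( X^d, \mathscr{A}^{\otimes d}, \lambda, \tilde T_1, \ldots, \tilde T_d \right),$$

where $\tilde T_1 := T_1 \otimes T_2 \otimes \cdots \otimes T_d$ and, for $2 \le i \le d$, $\tilde T_i := T_i \otimes T_i \otimes \cdots \otimes T_i$. Observe, by
considering the first coordinate, that $\mathbf{X}$ is a factor of $\tilde{\mathbf{X}}$. Note also that, for $2 \le i \le d$, $\tilde T_1$
and $\tilde T_i$ coincide on the sub-$\sigma$-algebra $\mathscr{A}_i$ generated by the $i$-th coordinate, from which we can
deduce

$$\mathscr{A}_2 \vee \cdots \vee \mathscr{A}_d \subset \tilde{\mathbf{X}}_{\mathscr{C}}. \qquad (8)$$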
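We now turn back to (7). For any fixed $h$, we have, by definition of the Furstenberg self-joining,

$$\frac{1}{N} \sum_{n=1}^{N} \int_X f_1 \circ T_1^n \cdots f_d \circ T_d^n \cdot f_1 \circ T_1^{n+h} \cdots f_d \circ T_d^{n+h} \, d\mu \xrightarrow[N \to \infty]{} \int_{X^d} (f_1 \otimes \cdots \otimes f_d) \cdot (f_1 \otimes \cdots \otimes f_d) \circ \tilde T_1^h \, d\lambda. \qquad (9)$$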
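
To see where (9) comes from, one can group the factors coordinate by coordinate. The auxiliary functions $g_i$ below are introduced only for this sketch; the passage from indicator functions to bounded functions in the definition of $\lambda$ is the standard approximation step and is assumed here.

```latex
\begin{align*}
g_i &:= f_i\cdot f_i\circ T_i^{h}\quad (1\le i\le d),
\qquad
f_i\circ T_i^{n}\cdot f_i\circ T_i^{n+h} = g_i\circ T_i^{n},\\[2pt]
\frac{1}{N}\sum_{n=1}^{N}\int_X \prod_{i=1}^{d} g_i\circ T_i^{n}\,d\mu
 &\xrightarrow[N\to\infty]{}
 \int_{X^d} g_1\otimes\cdots\otimes g_d\,d\lambda
 = \int_{X^d} (f_1\otimes\cdots\otimes f_d)\cdot
   (f_1\otimes\cdots\otimes f_d)\circ\tilde T_1^{h}\,d\lambda .
\end{align*}
```

The last equality holds because $\tilde T_1^h = T_1^h \otimes \cdots \otimes T_d^h$ acts on the $i$-th coordinate as $T_i^h$.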



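Averaging the latter expression over $h \in \{1, \ldots, H\}$, and letting $H$ go to infinity gives, using
the mean ergodic theorem for the single transformation $\tilde T_1$,

$$\int_{X^d} (f_1 \otimes \cdots \otimes f_d) \; \mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1 \otimes \cdots \otimes f_d \mid \mathscr{I}^{\tilde T_1} \right] d\lambda. \qquad (10)$$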



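Assume now that $\mathbb{E}_{\mathbf{X}}\left[ f_1 \mid \mathbf{X}_{\mathscr{C}} \right] = 0$. Then, if $\mathbf{X}$ is a $\mathscr{C}$-sated system, we have

$$\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1(x_1) \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right] = \mathbb{E}_{\tilde{\mathbf{X}}}\left[ \mathbb{E}_{\mathbf{X}}\left[ f_1 \mid \mathbf{X}_{\mathscr{C}} \right](x_1) \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right] = 0.$$

Recalling (8), we get

$$\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1 \otimes \cdots \otimes f_d \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right] = f_2(x_2) \cdots f_d(x_d) \, \mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1(x_1) \mid \tilde{\mathbf{X}}_{\mathscr{C}} \right] = 0,$$

and since $\mathscr{I}^{\tilde T_1} \subset \tilde{\mathbf{X}}_{\mathscr{C}}$,

$$\mathbb{E}_{\tilde{\mathbf{X}}}\left[ f_1 \otimes \cdots \otimes f_d \mid \mathscr{I}^{\tilde T_1} \right] = 0,$$

which proves (7).

We thus have proved the following partial result:

Proposition 3.1. If the statement of Theorem 1.1 is valid for $d - 1$ commuting transformations,
then it is also valid for any $\mathscr{C}$-sated system of $d$ commuting transformations.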

4 Existence of $\mathscr{C}$-sated extensions

The purpose of this section is to prove the existence of a $\mathscr{C}$-sated extension for any dynamical
system, in a general context including the case we need to finish the proof of Theorem 1.1. An
important part of the arguments used below was developed in [7] for the study of another
class of systems, namely the class of all factors of all countable self-joinings of a given system.
But as mentioned in [3], they work in a quite general setting, which we present here in detail.

From now on, let $\mathscr{C}$ denote a class of dynamical systems, which we always assume to
be stable under taking isomorphisms. As before, we call $\mathscr{C}$-factor of a dynamical system
$(X, \mathscr{A}, \mu, T_1, \ldots, T_d)$ any factor sub-$\sigma$-algebra on which the action of $T_1, \ldots, T_d$ defines a
system in the class $\mathscr{C}$. In the particular case of the class $\mathscr{C}$ used in the preceding sections, it was
quite obvious that any system admits a largest $\mathscr{C}$-factor; this is in fact a general result,
provided that $\mathscr{C}$ satisfies a stability assumption.

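Lemma 4.1. If the class $\mathscr{C}$ is stable under taking countable joinings, then any system $\mathbf{X}$
admits a largest $\mathscr{C}$-factor, which we denote by $\mathbf{X}_{\mathscr{C}}$.

Proof. We just set

$$\mathbf{X}_{\mathscr{C}} := \left\{ A \in \mathscr{A} : A \text{ belongs to some } \mathscr{C}\text{-factor of } \mathbf{X} \right\},$$

and we claim that it is a $\mathscr{C}$-factor. Since $(X, \mathscr{A}, \mu)$ is a standard Borel space, the $\sigma$-algebra
$\mathscr{A}$ equipped with the metric $d(A, B) := \mu(A \,\Delta\, B)$ is separable (where we naturally identify
subsets $A$ and $A'$ of $X$ when $\mu(A \,\Delta\, A') = 0$). Therefore there exists a countable family $(A_i)_{i \in I}$
dense in $\mathbf{X}_{\mathscr{C}}$, and for each $i$ there is some $\mathscr{C}$-factor $\mathscr{F}_i$ containing $A_i$. Since the class $\mathscr{C}$ is
stable under taking countable joinings, $\mathscr{F} := \bigvee_{i \in I} \mathscr{F}_i$ is itself a $\mathscr{C}$-factor. By density, we have
$\mathbf{X}_{\mathscr{C}} \subset \mathscr{F}$ but, since $\mathbf{X}_{\mathscr{C}}$ contains every $\mathscr{C}$-factor, we have $\mathbf{X}_{\mathscr{C}} = \mathscr{F}$. □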

If $\mathscr{C}$ is stable under taking countable joinings, we can thus repeat Definition 2.3 in this
more general setting:

Definition 4.2. The system $\mathbf{X}$ is said to be $\mathscr{C}$-sated if any joining of $\mathbf{X}$ with a $\mathscr{C}$-system is
relatively independent over the largest $\mathscr{C}$-factor $\mathbf{X}_{\mathscr{C}}$ of $\mathbf{X}$.

Proposition 4.3. If the class $\mathscr{C}$ is stable under taking countable joinings and under taking
factors, then any system $\mathbf{X}$ is $\mathscr{C}$-sated.

The proof is based on a fundamental lemma, published simultaneously in two papers [5, 6],
stating that if two systems $\mathbf{X}$ and $\mathbf{Y}$ are not disjoint, then $\mathbf{X}$ possesses a non-trivial common
factor with a joining of countably many copies of $\mathbf{Y}$. We slightly rephrase this lemma in order
to make it more convenient for our purposes:

Lemma 4.4. Let $\lambda$ be a joining of two systems $\mathbf{X} = (X, \mathscr{A}, \mu, (T_i))$ and $\mathbf{Y} = (Y, \mathscr{B}, \nu, (S_i))$,
and let $g$ be a bounded measurable function defined on $\mathbf{Y}$. Then there exists a factor
sub-$\sigma$-algebra $\mathscr{F}$ in $\mathbf{X}$ such that the action of $T_1, \ldots, T_d$ on $\mathscr{F}$ is isomorphic to a factor of some
joining of countably many copies of $\mathbf{Y}$, and satisfying

$$\mathbb{E}_\lambda\left[ g(y) \mid \mathbf{X} \right] = \mathbb{E}_\lambda\left[ g(y) \mid \mathscr{F} \otimes \{\emptyset, Y\} \right]. \qquad (11)$$
Proof. We consider a countable family of copies of the dynamical system defined by the joining
$\lambda$, and consider their relatively independent joining $\lambda_\infty$ over their common factor $\mathbf{X}$. Then
$\lambda_\infty$ is a probability measure on the space $X \times Y^{\mathbb{N}}$, which is easily seen to be invariant under
the shift transformation on each $x$-fiber, $\sigma : (x, y_0, y_1, y_2, \ldots) \mapsto (x, y_1, y_2, y_3, \ldots)$. Moreover,
$\lambda_\infty$ conditioned on each such fiber is a product measure. A relative version of the Kolmogorov
0-1 law (see e.g. [6], Lemma 9) gives that, modulo $\lambda_\infty$, the $\sigma$-algebra $\mathscr{I}^{\sigma}$ of shift-invariant
events coincides with the $\sigma$-algebra $\mathscr{A} \otimes \{\emptyset, Y^{\mathbb{N}}\}$ generated by the $x$ coordinate. Consider
now a bounded measurable function $g$ on $Y$, and set $g_\infty(x, (y_n)) := g(y_1)$ for $x \in X$ and
$(y_n) \in Y^{\mathbb{N}}$. Applying the ergodic theorem in the dynamical system $(X \times Y^{\mathbb{N}}, \lambda_\infty, \sigma)$ to the
function $g_\infty$, we obtain

$$\frac{1}{N} \sum_{n=1}^{N} g(y_n) \xrightarrow[N \to \infty]{} \mathbb{E}_{\lambda_\infty}\left[ g_\infty \mid \mathscr{I}^{\sigma} \right] = \mathbb{E}_{\lambda_\infty}\left[ g(y_1) \mid \mathbf{X} \right],$$

and by definition of $\lambda_\infty$ the latter is equal to $\mathbb{E}_\lambda[g(y) \mid \mathbf{X}]$. Hence, $\mathbb{E}_\lambda[g(y) \mid \mathbf{X}]$ coincides
modulo $\lambda_\infty$ with a function which is measurable with respect to $(y_n)_{n \in \mathbb{N}}$. It follows that the
factor $\mathscr{F}$ of $\mathbf{X}$ generated by $\mathbb{E}_\lambda[g(y) \mid \mathbf{X}]$ is isomorphic to a factor of the joining of countably
many copies of $\mathbf{Y}$ obtained by considering the $(y_n)$-coordinates in $\lambda_\infty$. Finally, with this
definition of $\mathscr{F}$ we obviously have (11). □

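Proof of Proposition 4.3. Let $\mathbf{X}$ be any dynamical system, and $\lambda$ be a joining of $\mathbf{X}$ with a
$\mathscr{C}$-system $\mathbf{Y}$. For a given bounded measurable function $g$ defined on $\mathbf{Y}$, let $\mathscr{F}$ be the factor
sub-$\sigma$-algebra given by Lemma 4.4. By stability of $\mathscr{C}$ under taking countable joinings and
factors, $\mathscr{F}$ is a $\mathscr{C}$-factor of $\mathbf{X}$, and $\mathscr{F}$ is therefore contained in the largest $\mathscr{C}$-factor $\mathbf{X}_{\mathscr{C}}$.
Equation (11) then gives

$$\mathbb{E}_\lambda\left[ g(y) \mid \mathbf{X} \right] = \mathbb{E}_\lambda\left[ g(y) \mid \mathbf{X}_{\mathscr{C}} \otimes \{\emptyset, Y\} \right],$$

and this equation means that, in the joining $\lambda$, $\mathbf{X}$ and $\mathbf{Y}$ are relatively independent over $\mathbf{X}_{\mathscr{C}}$.
This proves that $\mathbf{X}$ is $\mathscr{C}$-sated. □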

The class $\mathscr{C}$ of dynamical systems used in Section 2 and its generalization in
Section 3 are easily proved to be stable under taking countable joinings, but unfortunately
they are not stable under taking factors (see Annex B).

This is why it is necessary in general to pass to extensions to get $\mathscr{C}$-sated systems. The
remainder of this section is devoted to the proof of the following theorem.

Theorem 4.5. Let $\mathscr{C}$ be a class of dynamical systems which is stable under taking countable
joinings. Then any system admits a $\mathscr{C}$-sated extension.

For $\mathscr{C}$ satisfying the hypothesis of the above theorem, we start by introducing the class
$\overline{\mathscr{C}}$ consisting of dynamical systems which are factors of $\mathscr{C}$-systems. Obviously $\overline{\mathscr{C}}$ is stable
under taking factors, and we can also check that $\overline{\mathscr{C}}$ is stable under taking countable joinings.
Indeed, let $\mathbf{Z}$ be a joining of a countable family $(\mathbf{X}_i)_{i \in I}$ of $\overline{\mathscr{C}}$-systems. For each $i$, let $\tilde{\mathbf{X}}_i$ be
a $\mathscr{C}$-extension of $\mathbf{X}_i$, and define $\mathbf{Y}_i$ as the relatively independent joining of $\tilde{\mathbf{X}}_i$ and $\mathbf{Z}$ over
their common factor $\mathbf{X}_i$. Then, consider the relatively independent joining $\tilde{\mathbf{Z}}$ of the $\mathbf{Y}_i$'s over
their common factor $\mathbf{Z}$. In $\tilde{\mathbf{Z}}$, each factor $\mathbf{X}_i$ of $\mathbf{Z}$ is identified with a factor of $\tilde{\mathbf{X}}_i$, hence $\mathbf{Z}$
itself, which is generated by all the $\mathbf{X}_i$'s, is contained in the $\sigma$-algebra generated by the $\tilde{\mathbf{X}}_i$'s.
$\mathbf{Z}$ is thus a factor of the joining of the $\tilde{\mathbf{X}}_i$'s defined by $\tilde{\mathbf{Z}}$, and since $\mathscr{C}$ is stable under taking
countable joinings, this joining is a $\mathscr{C}$-system.

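In any system $\mathbf{X} = (X, \mathscr{A}, \mu, (T_i))$, there exist therefore a largest $\mathscr{C}$-factor $\mathbf{X}_{\mathscr{C}}$, and a
largest $\overline{\mathscr{C}}$-factor $\mathbf{X}_{\overline{\mathscr{C}}}$. Since any $\mathscr{C}$-system is obviously a $\overline{\mathscr{C}}$-system, $\mathbf{X}_{\mathscr{C}} \subset \mathbf{X}_{\overline{\mathscr{C}}}$.

Proposition 4.6. $\mathbf{X}$ is $\mathscr{C}$-sated if and only if $\mathbf{X}_{\mathscr{C}} = \mathbf{X}_{\overline{\mathscr{C}}}$.

Proof. The if part is a direct corollary of Proposition 4.3 applied to the class $\overline{\mathscr{C}}$. Conversely,
let us assume that $\mathbf{X}$ is $\mathscr{C}$-sated. Let $\mathbf{Y}$ be a $\mathscr{C}$-extension of $\mathbf{X}_{\overline{\mathscr{C}}}$, and consider the relatively
independent joining of $\mathbf{X}$ and $\mathbf{Y}$ over their common factor $\mathbf{X}_{\overline{\mathscr{C}}}$: since $\mathbf{X}$ is $\mathscr{C}$-sated, this
joining is relatively independent over $\mathbf{X}_{\mathscr{C}}$. But this is only possible if $\mathbf{X}_{\overline{\mathscr{C}}} \subset \mathbf{X}_{\mathscr{C}}$. □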
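
The last step of this proof can be made explicit as follows. This is a sketch of one way to fill it in, not necessarily the argument the author had in mind; $f$ denotes an arbitrary bounded function on $\mathbf{X}$ measurable with respect to the largest $\overline{\mathscr{C}}$-factor $\mathbf{X}_{\overline{\mathscr{C}}}$, and $\lambda$ denotes the relatively independent joining of $\mathbf{X}$ and $\mathbf{Y}$ over $\mathbf{X}_{\overline{\mathscr{C}}}$, both names being introduced here.

```latex
\begin{align*}
f(x) &= \mathbb{E}_\lambda\bigl[f(x)\,\big|\,\mathbf{Y}\bigr]
   && \text{($\mathbf{X}_{\overline{\mathscr{C}}}$ is a factor of $\mathbf{Y}$, so $f$ is $\mathbf{Y}$-measurable)}\\
 &= \mathbb{E}_\lambda\Bigl[\mathbb{E}_{\mathbf{X}}\bigl[f\,\big|\,\mathbf{X}_{\mathscr{C}}\bigr](x)\,\Big|\,\mathbf{Y}\Bigr]
   && \text{($\mathbf{X}$ is $\mathscr{C}$-sated and $\mathbf{Y}$ is a $\mathscr{C}$-system)},\\
\|f\|_{2} &\le \bigl\|\mathbb{E}_{\mathbf{X}}[f\mid\mathbf{X}_{\mathscr{C}}]\bigr\|_{2}\le\|f\|_{2}
   && \text{(conditional expectations contract the $L^2$ norm)}.
\end{align*}
```

Equality in the last line forces $f = \mathbb{E}_{\mathbf{X}}[f \mid \mathbf{X}_{\mathscr{C}}]$ almost everywhere, that is, $f$ is $\mathbf{X}_{\mathscr{C}}$-measurable.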

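Proof of Theorem 4.5. We use the same construction as above: given a dynamical system $\mathbf{X}$,
we consider its largest $\overline{\mathscr{C}}$-factor $\mathbf{X}_{\overline{\mathscr{C}}}$, a $\mathscr{C}$-extension $\mathbf{Y}$ of $\mathbf{X}_{\overline{\mathscr{C}}}$, and the relatively independent
joining of $\mathbf{X}$ and $\mathbf{Y}$ over their common factor $\mathbf{X}_{\overline{\mathscr{C}}}$. Let us denote by $\mathbf{Z}$ the latter system: $\mathbf{Z}$
is the extension of $\mathbf{X}$ which will be proved to be $\mathscr{C}$-sated. For this, by Proposition 4.6, it is
enough to establish that the largest $\overline{\mathscr{C}}$-factor of $\mathbf{Z}$ is $\mathbf{Y}$: since $\mathbf{Y}$ is a $\mathscr{C}$-system, this will give
$\mathbf{Z}_{\mathscr{C}} = \mathbf{Z}_{\overline{\mathscr{C}}} = \mathbf{Y}$.

[Diagram: $\mathbf{Z} = \mathbf{X} \otimes_{\mathbf{X}_{\overline{\mathscr{C}}}} \mathbf{Y}$ projects onto its factors $\mathbf{X}$ and $\mathbf{Y}$, which both project onto their common factor $\mathbf{X}_{\overline{\mathscr{C}}}$.]

Let us consider a joining $\lambda$ of $\mathbf{Z}$ with a $\overline{\mathscr{C}}$-system $\mathbf{W}$. Since the joining of $\mathbf{Y}$ and $\mathbf{W}$
induced by $\lambda$ is still a $\overline{\mathscr{C}}$-system, Proposition 4.3 ensures that, inside $\lambda$, $\mathbf{X}$ and $(\mathbf{W} \vee \mathbf{Y})$ are
relatively independent over $\mathbf{X}_{\overline{\mathscr{C}}}$. Hence $\mathbf{X}$ and $\mathbf{W}$ are relatively independent over $\mathbf{Y}$, and
finally $\mathbf{Z}$ and $\mathbf{W}$ are relatively independent over $\mathbf{Y}$ (because $\mathbf{Z}$ is generated by $\mathbf{X}$ and $\mathbf{Y}$).
We thus have proved that any joining of $\mathbf{Z}$ with a $\overline{\mathscr{C}}$-system is relatively independent over
$\mathbf{Y}$. Taking in particular the relatively independent joining of $\mathbf{Z}$ with a $\mathscr{C}$-extension of $\mathbf{Z}_{\overline{\mathscr{C}}}$,
we see that this is only possible if $\mathbf{Z}_{\overline{\mathscr{C}}} \subset \mathbf{Y}$. Since the converse inclusion obviously holds, this
concludes the proof. □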

Annex A. Proof of Van der Corput Lemma

Here is a proof of Lemma 2.1. First, observe that, since the sequence $(u_n)$ is bounded, for
any $H$ we have

$$\frac{1}{N} \sum_{n=1}^{N} u_n = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{H} \sum_{h=1}^{H} u_{n+h} + O(H/N).$$



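Using the classical inequality $(a + b)^2 \le 2(a^2 + b^2)$, then the triangle inequality and finally
the Cauchy-Schwarz inequality in the form $\left( \frac{1}{N} \sum_{n=1}^{N} a_n \right)^2 \le \frac{1}{N} \sum_{n=1}^{N} a_n^2$, we get

$$\begin{aligned}
\left\| \frac{1}{N} \sum_{n=1}^{N} u_n \right\|^2
&\le 2 \left\| \frac{1}{N} \sum_{n=1}^{N} \frac{1}{H} \sum_{h=1}^{H} u_{n+h} \right\|^2 + O(H^2/N^2) \\
&\le 2 \left( \frac{1}{N} \sum_{n=1}^{N} \left\| \frac{1}{H} \sum_{h=1}^{H} u_{n+h} \right\| \right)^2 + O(H^2/N^2) \\
&\le \frac{2}{N} \sum_{n=1}^{N} \left\| \frac{1}{H} \sum_{h=1}^{H} u_{n+h} \right\|^2 + O(H^2/N^2).
\end{aligned}$$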

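We now have to estimate

$$\frac{1}{N} \sum_{n=1}^{N} \left\| \frac{1}{H} \sum_{h=1}^{H} u_{n+h} \right\|^2 = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{H} \sum_{h=1}^{H} \frac{1}{H} \sum_{h'=1}^{H} \langle u_{n+h}, u_{n+h'} \rangle.$$

We split the RHS into three pieces $P_{h=h'}$, $P_{h<h'}$, and $P_{h>h'}$, corresponding respectively to the
terms where $h = h'$, $h < h'$, and $h > h'$. The first piece is simply controlled by choosing $H$
large enough:

$$P_{h=h'} = \frac{1}{N} \sum_{n=1}^{N} \frac{1}{H^2} \sum_{h=1}^{H} \langle u_{n+h}, u_{n+h} \rangle = O(1/H).$$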

The second and third pieces are treated with the same computation; we only detail here the
case $h < h'$:

$$\begin{aligned}
P_{h<h'} &= \frac{1}{H} \sum_{h=1}^{H} \frac{1}{H} \sum_{h'=h+1}^{H} \frac{1}{N} \sum_{n=1}^{N} \langle u_{n+h}, u_{n+h'} \rangle \\
&= \frac{1}{H} \sum_{h=1}^{H} \frac{1}{H} \sum_{h'=1}^{H-h} \frac{1}{N} \sum_{n=1}^{N} \langle u_n, u_{n+h'} \rangle + O(H/N) \\
&= \frac{1}{H} \sum_{h=1}^{H-1} \frac{1}{H} \sum_{h'=1}^{h} \frac{1}{N} \sum_{n=1}^{N} \langle u_n, u_{n+h'} \rangle + O(H/N).
\end{aligned}$$

Fixing $H$ large enough, the hypothesis then implies that $P_{h<h'}$ can be made arbitrarily close
to zero when $N \to \infty$, which completes the proof.
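
The final step can be unpacked as follows. This sketch assumes, as the statement of the lemma implicitly does, that the inner limits $\gamma_h$ exist; the symbols $\gamma_h$ and $\Gamma_k$ are introduced here for illustration only.

```latex
\begin{align*}
\gamma_h &:= \lim_{N\to\infty}\frac{1}{N}\sum_{n=1}^{N}\langle u_n,u_{n+h}\rangle,
\qquad
\Gamma_k := \frac{1}{k}\sum_{h=1}^{k}\gamma_h \xrightarrow[k\to\infty]{} 0
\quad\text{(the hypothesis)},\\
\lim_{N\to\infty}P_{h<h'}
 &= \frac{1}{H^{2}}\sum_{h=1}^{H}\ \sum_{h'=1}^{H-h}\gamma_{h'}
  = \frac{1}{H}\sum_{k=1}^{H-1}\frac{k}{H}\,\Gamma_k
  \xrightarrow[H\to\infty]{} 0 .
\end{align*}
```

The last convergence holds because the weights $k/H$ are bounded by 1 and $\Gamma_k \to 0$; the piece $P_{h>h'}$ is handled in the same way.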

Annex B. A factor of a $\mathscr{C}$-system is not always a $\mathscr{C}$-system

Here is an example showing that the class $\mathscr{C}$ defined in Section 2 is not stable under taking
factors. For each $\alpha \in \mathbb{T} := \mathbb{R}/\mathbb{Z}$, let us denote by $R_\alpha$ the translation on $\mathbb{T}$: $x \mapsto x + \alpha$
mod 1, and by $\mu$ the Haar measure on $\mathbb{T}$. For some fixed irrational $\alpha$, we consider the system
$\mathbf{X} = (\mathbb{T} \times \mathbb{T}, \mu \otimes \mu, T_1, T_2)$ where $T_1 := R_\alpha \otimes R_{2\alpha}$, and $T_2 := R_{2\alpha} \otimes R_{2\alpha}$. Denoting by $x$
(respectively $y$) the first (respectively second) coordinate on $\mathbb{T} \times \mathbb{T}$, we observe that any
function of $2x - y$ mod 1 is invariant by $T_1$, hence is measurable with respect to the $\mathscr{C}$-factor
$\mathbf{X}_{\mathscr{C}}$. Observe also that on the $\sigma$-algebra generated by $y$, $T_1$ and $T_2$ define the same action,
hence any function of $y$ is also measurable with respect to the $\mathscr{C}$-factor $\mathbf{X}_{\mathscr{C}}$. It follows that
the factor of $\mathbf{X}$ generated by $2x$ mod 1 $= (2x - y) + y$ mod 1 is contained in $\mathbf{X}_{\mathscr{C}}$, hence is
a factor of a $\mathscr{C}$-system. However, the action of $(T_1, T_2)$ restricted to this factor is isomorphic
to the action of $(R_{2\alpha}, R_{4\alpha})$ on $\mathbb{T}$. The latter is certainly not a $\mathscr{C}$-system, since both $\mathscr{I}^{R_{2\alpha}}$
and $\mathscr{I}^{R_{4\alpha}R_{2\alpha}^{-1}}$ are trivial.
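
The claims in this example reduce to short computations modulo 1. As a sketch, taking $T_1 = R_\alpha \otimes R_{2\alpha}$ and $T_2 = R_{2\alpha} \otimes R_{2\alpha}$ as in the example, with $(x, y) \in \mathbb{T} \times \mathbb{T}$:

```latex
\begin{align*}
(2x-y)\circ T_1 &= 2(x+\alpha)-(y+2\alpha) = 2x-y,
 &&\text{so } 2x-y \text{ is } T_1\text{-invariant};\\
T_2T_1^{-1}(x,y) &= (x+\alpha,\;y),
 &&\text{so every function of } y \text{ is } T_2T_1^{-1}\text{-invariant};\\
(2x)\circ T_1 &= 2x+2\alpha,\qquad (2x)\circ T_2 = 2x+4\alpha,
 &&\text{so } (T_1,T_2) \text{ acts on } 2x \text{ as } (R_{2\alpha},R_{4\alpha});\\
R_{4\alpha}R_{2\alpha}^{-1} &= R_{2\alpha},
 &&\text{which is ergodic since } 2\alpha \text{ is irrational}.
\end{align*}
```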



References 

[1] Tim Austin, On the norm convergence of nonconventional ergodic averages, Ergodic Theory
Dynam. Systems, to appear.

[2] Tim Austin, Pleasant extensions subject to some algebraic constraints, and applications,
preliminary notes available on arXiv:0905.0518, 2009.

[3] Thierry de la Rue, An introduction to joinings in ergodic theory, Discrete Contin. Dyn.
Syst. 15 (2006), no. 1, 121-142.

[4] Harry Furstenberg, Ergodic behavior of diagonal measures and a theorem of Szemerédi on
arithmetic progressions, J. Analyse Math. 31 (1977), 204-256.

[5] E. Glasner, J.-P. Thouvenot, and B. Weiss, Entropy theory without a past, Ergodic Theory
Dynam. Systems 20 (2000), no. 5, 1355-1370.

[6] M. Lemańczyk, F. Parreau, and J.-P. Thouvenot, Gaussian automorphisms whose ergodic
self-joinings are Gaussian, Fund. Math. 164 (2000), no. 3, 253-293.

[7] E. Lesigne, B. Rittaud, and T. de la Rue, Weak disjointness of measure-preserving
dynamical systems, Ergodic Theory Dynam. Systems 23 (2003), no. 4, 1173-1198.

[8] T. Tao, Norm convergence of multiple ergodic averages for commuting transformations,
Ergodic Theory Dynam. Systems 28 (2008), no. 2, 657-688.






